Using Tree Automata for XML Mining and Web Mining with Constraints

نویسندگان

  • Sandra de Amo
  • Cricia Zilda Felicio
چکیده

Most work on pattern mining focus on simple data structures like itemsets or sequences of itemsets. However, a lot of recent applications dealing with complex data like chemical compounds, protein structure, XML and Web Log databases and social network, require much more sophisticated data structures (trees or graphs) for their specification. Here, interesting patterns involve not only frequent object values (labels) appearing in the graphs (or trees) but also frequent specific topologies found in these structures. In a recent work, we have introduced the method CobMiner for tree pattern mining with constraints. CobMiner uses tree automata as a mechanism to specify user constraints over tree patterns. In this paper, we focus on applications of this method in two different contexts: XML Mining and Web Usage Mining. An extensive set of experiments executed over real data in these two fields allow us to conclude that CoBMiner is an efficient method for mining datasets whose elements are specified as trees.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Tree pattern mining with tree automata constraints

Most work on pattern mining focuses on simple data structures such as itemsets and sequences of itemsets. However, a lot of recent applications dealing with complex data like chemical compounds, protein structures, XML and Web log databases and social networks, require much more sophisticated data structures such as trees and graphs. In these contexts, interesting patterns involve not only freq...

متن کامل

Constraint-based Tree Pattern Mining

Most work on pattern mining focus on simple data structures like itemsets or sequences of itemsets. However, a lot of recent applications dealing with complex data like chemical compounds, protein structure, XML and Web Log databases, social network, require much more sophisticated data structures (trees or graphs) for their specification. Here, interesting patterns involve not only frequent ob...

متن کامل

Optimizing Membership Functions using Learning Automata for Fuzzy Association Rule Mining

The Transactions in web data often consist of quantitative data, suggesting that fuzzy set theory can be used to represent such data. The time spent by users on each web page is one type of web data, was regarded as a trapezoidal membership function (TMF) and can be used to evaluate user browsing behavior. The quality of mining fuzzy association rules depends on membership functions and since t...

متن کامل

Development of a Combined System Based on Data Mining and Semantic Web for the Diagnosis of Autism

Introduction: Autism is a nervous system disorder, and since there is no direct diagnosis for it, data mining can help diagnose the disease. Ontology as a backbone of the semantic web, a knowledge database with shareability and reusability, can be a confirmation of the correctness of disease diagnosis systems. This study aimed to provide a system for diagnosing autistic children with a combinat...

متن کامل

Development of a Combined System Based on Data Mining and Semantic Web for the Diagnosis of Autism

Introduction: Autism is a nervous system disorder, and since there is no direct diagnosis for it, data mining can help diagnose the disease. Ontology as a backbone of the semantic web, a knowledge database with shareability and reusability, can be a confirmation of the correctness of disease diagnosis systems. This study aimed to provide a system for diagnosing autistic children with a combinat...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2007